filter_lua: Add chunk mode for processing multiple records #8478

drbugfinder-work · 2024-02-12T14:44:51Z

This PR introduces a chunk_mode option for the Lua filter. It is useful for use cases such as parallelization (see Lua Lanes).

Note that in this mode the Lua function takes only two arguments, the tag and a table of records:

function process_records(tag, records)
  -- records is an array of entries, each holding a timestamp table and a record table
  if records and type(records) == "table" then
    for i, record_row in ipairs(records) do
      local timestamp = record_row.timestamp
      local record = record_row.record

      print("Timestamp entry:", timestamp.sec, timestamp.nsec)
      print("Record entry:", record.message)
    end
  else
    print("Error: Invalid 'records' table or nil")
  end
  -- return the chunk in the same format it was received
  return records
end

Its configuration looks like this:

[FILTER]
    Name          lua
    Match         my_logs
    script        lanes_example.lua
    call          process_records
    chunk_mode    On
    time_as_table On

The returned table must be in the same format (table of timestamp and record pairs).

This mode currently supports only time_as_table and always emits the returned records. There is no return code to set.
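
As a minimal sketch of the contract described above, the callback below builds a new table in the same shape it received and adds a field to each record. The function name `tag_records` and the field `source_tag` are made up for illustration. That leaving an entry out of the returned table drops the record is an assumption derived from the statement that the returned records are always emitted; it is not confirmed anywhere in this PR.

```lua
-- Hypothetical chunk_mode callback; the name and fields are illustrative only.
function tag_records(tag, records)
  if type(records) ~= "table" then
    return records
  end
  local out = {}
  for _, row in ipairs(records) do
    local record = row.record
    -- Assumption: a row that is not copied into the returned table is dropped.
    if type(record) == "table" and record.message ~= nil then
      record.source_tag = tag
      -- Keep the same shape: a timestamp table plus the record table.
      out[#out + 1] = { timestamp = row.timestamp, record = record }
    end
  end
  return out
end

-- Local check outside Fluent Bit, using hand-built input.
local sample = {
  { timestamp = { sec = 1707749091, nsec = 0 }, record = { message = "hello" } },
  { timestamp = { sec = 1707749092, nsec = 5 }, record = { level = "debug" } },
}
local result = tag_records("my_logs", sample)
print(#result, result[1].record.source_tag)
```

Running the script with a standalone Lua interpreter is a quick way to check the shape of the returned table before wiring the function into a [FILTER] section.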

One use case is running Lua filters in parallel with the Lua Lanes library.
See the example below. Install Lua Lanes first (e.g. apt install luarocks && luarocks install lanes) and check the path in lanes_example.lua.

fluent-bit.conf: https://gist.github.com/drbugfinder-work/63b6992247cfc3f865ee08f01c3d4b64
lanes_example.lua: https://gist.github.com/drbugfinder-work/252bc103cd8efe6b0153242df8494978

Please see valgrind output:

https://gist.github.com/drbugfinder-work/54eab5d992c902cc067a07e29d03839e

Documentation PR:
fluent/fluent-bit-docs#1310

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

Example configuration file for the change
Debug log output from testing the change

Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

[N/A] Run local packaging test showing all targets (including any new ones) build.
[N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

Documentation required for this feature

Backporting

[N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Documentation for fluent/fluent-bit#8478 Signed-off-by: Richard Treu <richard.treu@sap.com>

agup006 · 2024-02-22T17:21:02Z

@tarruda would you take a look?

tarruda · 2024-02-23T12:57:58Z

@drbugfinder-work Thanks for this PR.

I skimmed quickly through the PR and lua lanes documentation in order to understand the end goal, and I assume this is how everything will work:

The only difference in chunk mode is that it can pass multiple records at once to the Lua callback.
Lua lanes is a library that allows you to spread processing across multiple OS threads by spawning multiple Lua states (one for each thread), running the function in each thread and then returning the results to the main thread.
Since the callback receives multiple records, your end goal is to split the records so that they would be processed in parallel.

(Please let me know if I missed something.)

With that in mind, I have some questions:

How many records can fluent-bit pass to the Lua filter at once?
Have you done any benchmarking with a big volume of records and compared it with the traditional approach?
Will Lua lanes create a new thread every time you ask it to run a function?
How is the data shared between threads in lua lanes (is there serialization/deserialization to pass records across threads)?

The reason these questions are important is that this approach might have a lot of overhead, and it adds quite a lot of complexity to the lua_filter implementation. Spawning threads is expensive and only worth it if you have a huge amount of data. If fluent-bit can never pass more than a few hundred records to the Lua callback, I suspect splitting the workload across threads will make things even slower (this is why I suggest benchmarking).

drbugfinder-work · 2024-02-23T13:29:45Z

Hi @tarruda,

your summary is absolutely right. The main goal is to parallelize record processing in Lua. However, there might be other use cases that benefit from processing more than one record at once within Lua (e.g. trend analysis).
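
To illustrate processing that needs more than one record at a time, here is a small sketch that computes a statistic over the whole chunk and writes it onto every record. The callback name `annotate_chunk_stats`, the `level` field and the two added fields are hypothetical and do not come from this PR.

```lua
-- Hypothetical callback: annotates every record with chunk-level statistics.
function annotate_chunk_stats(tag, records)
  if type(records) ~= "table" then
    return records
  end
  local total, errors = 0, 0
  -- First pass: count the records and how many of them are errors.
  for _, row in ipairs(records) do
    total = total + 1
    -- Assumption: records carry a string "level" field.
    if type(row.record) == "table" and row.record.level == "error" then
      errors = errors + 1
    end
  end
  local ratio = 0
  if total > 0 then
    ratio = errors / total
  end
  -- Second pass: attach the chunk-wide numbers to each record.
  for _, row in ipairs(records) do
    if type(row.record) == "table" then
      row.record.chunk_size = total
      row.record.chunk_error_ratio = ratio
    end
  end
  return records
end

-- Local check outside Fluent Bit, using hand-built input.
local chunk = {
  { timestamp = { sec = 1, nsec = 0 }, record = { level = "info" } },
  { timestamp = { sec = 2, nsec = 0 }, record = { level = "error" } },
  { timestamp = { sec = 3, nsec = 0 }, record = { level = "info" } },
  { timestamp = { sec = 4, nsec = 0 }, record = { level = "info" } },
}
local annotated = annotate_chunk_stats("my_logs", chunk)
print(annotated[1].record.chunk_size, annotated[1].record.chunk_error_ratio)
```

A per-record callback cannot compute this, because it never sees the neighbouring records in the chunk.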

I'm testing this in different environments, especially in K8s clusters with a couple hundred instances of Fluent Bit. In high-load scenarios, I see chunks of up to around 300-500 records coming into the Lua filter per call. Under normal circumstances it fluctuates between 1 and 50 records per filter call, depending on the log volume in the pipeline and the Flush settings.
However, a single CPU core is no longer the limit: I now see up to 4 cores fully utilized per Fluent Bit instance in our environment. Using the Lua Lanes library and creating threads does add some overhead, but our benchmarks show it increases our maximum throughput by roughly 200-300%. The limiting factor is once again the main Fluent Bit pipeline rather than the Lua script.

Please keep in mind that this change to the plugin does not make direct use of the Lua Lanes library; that is just one use case. If the library is to be used, the user has to install it. How the records are distributed to worker threads within Lua is up to the user. I've added a simple example to the documentation that does not limit the number of worker threads: it creates a thread for every incoming record (which of course means one OS thread per record, and that may or may not perform well). If needed, the user can share data between threads through the Linda objects of the Lanes library.
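
One way to avoid a thread per record is to split the chunk into a bounded number of batches and start one lane per batch. The sketch below is not the example from the documentation PR; `MAX_WORKERS`, `split_batches` and `process_batch` are hypothetical names. It assumes the Lanes 3.x calls `require("lanes").configure()`, `lanes.gen(libs, func)` and reading the first return value via `handle[1]`, and it falls back to sequential processing when Lanes is not installed.

```lua
-- Lanes is optional here; without it the batches run sequentially.
local has_lanes, lanes = pcall(require, "lanes")
if has_lanes then
  lanes = lanes.configure()
end

local MAX_WORKERS = 4 -- hypothetical cap, tune for your environment

-- Split an array into at most n contiguous batches.
local function split_batches(records, n)
  local batches = {}
  if #records == 0 then
    return batches
  end
  local size = math.max(1, math.ceil(#records / n))
  for i = 1, #records, size do
    local batch = {}
    for j = i, math.min(i + size - 1, #records) do
      batch[#batch + 1] = records[j]
    end
    batches[#batches + 1] = batch
  end
  return batches
end

-- Per-batch work. It is kept self-contained because Lanes copies the
-- function and its arguments into a separate Lua state.
local function process_batch(batch)
  for _, row in ipairs(batch) do
    row.record.processed = true
  end
  return batch
end

function process_records(tag, records)
  if type(records) ~= "table" or #records == 0 then
    return records
  end
  local batches = split_batches(records, MAX_WORKERS)
  local results = {}
  if has_lanes then
    local start_lane = lanes.gen("*", process_batch)
    local handles = {}
    for i, batch in ipairs(batches) do
      handles[i] = start_lane(batch) -- one lane per batch, not per record
    end
    for i, handle in ipairs(handles) do
      results[i] = handle[1] -- blocks until that lane has finished
    end
  else
    for i, batch in ipairs(batches) do
      results[i] = process_batch(batch)
    end
  end
  -- Flatten the batches back into one table, preserving record order.
  local out = {}
  for _, batch in ipairs(results) do
    for _, row in ipairs(batch) do
      out[#out + 1] = row
    end
  end
  return out
end

-- Local check outside Fluent Bit, using hand-built input.
local input = {}
for id = 1, 10 do
  input[id] = { timestamp = { sec = id, nsec = 0 }, record = { id = id } }
end
local merged = process_records("my_logs", input)
print(#merged, merged[10].record.id)
```

This still starts up to `MAX_WORKERS` lanes on every filter call, and each batch is copied into and out of its lane. A pool of long-lived lanes fed through Linda objects would avoid the repeated thread start-up, at the cost of more code, and is not shown here.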

I don't think that this adds much complexity to the lua filter plugin itself, as it just passes a table of records instead of a single record to Lua - but it gives you the option to make use of all the available data in that chunk.

tarruda

@drbugfinder-work Since you are seeing an increase in throughput and this change is backwards compatible, LGTM

github-actions · 2024-06-08T01:51:22Z

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

drbugfinder-work · 2024-07-04T07:13:36Z

Over the last few months we noticed a memory leak (not detected by valgrind) caused by this change.
Please do not merge until this is fixed.

drbugfinder-work requested review from edsiper, leonardo-albertovich, fujimotos and koleini as code owners February 12, 2024 14:44

github-actions bot added the docs-required label Feb 12, 2024

drbugfinder-work mentioned this pull request Feb 12, 2024

Output Processor Multi-Threading not working as expected (Mutex wait) #8245

Closed

drbugfinder-work added a commit to drbugfinder-work/fluent-bit-docs that referenced this pull request Feb 12, 2024

filter_lua: Add chunk mode for processing multiple records

d52d17e

Documentation for fluent/fluent-bit#8478 Signed-off-by: Richard Treu <richard.treu@sap.com>

drbugfinder-work mentioned this pull request Feb 12, 2024

filter_lua: Add chunk mode for processing multiple records fluent/fluent-bit-docs#1310

Open

drbugfinder-work force-pushed the lua-chunk-mode branch from 647db35 to 8beca2c Compare February 12, 2024 22:25

drbugfinder-work force-pushed the lua-chunk-mode branch from 8beca2c to ce873bb Compare February 13, 2024 07:52

drbugfinder-work added a commit to drbugfinder-work/fluent-bit-docs that referenced this pull request Feb 19, 2024

filter_lua: Add chunk mode for processing multiple records

9577eb0

Documentation for fluent/fluent-bit#8478 Signed-off-by: Richard Treu <richard.treu@sap.com>

drbugfinder-work force-pushed the lua-chunk-mode branch from ce873bb to a956e55 Compare February 20, 2024 16:59

drbugfinder-work force-pushed the lua-chunk-mode branch from a956e55 to a5a569f Compare February 21, 2024 13:41

drbugfinder-work force-pushed the lua-chunk-mode branch from a5a569f to 46b4f59 Compare February 22, 2024 14:05

tarruda approved these changes Feb 23, 2024

github-actions bot added the Stale label Jun 8, 2024

github-actions bot removed the Stale label Jul 11, 2024

drbugfinder-work closed this Sep 17, 2024
